Methyl CpG binding domain nucleic acids from maize

ABSTRACT

The present invention relates to methyl binding domain nucleic acids and polypeptides isolated from Zea mays L.

RELATED APPLICATION INFORMATION

[0001] This application claims priority from U.S. Ser. No. 60/218,745 filed on Jul. 17, 2000.

TECHNICAL FIELD OF THE INVENTION

[0002] The present invention relates to molecular biology. More specifically, the present invention relates to the isolation and identification of certain Methyl CpG binding domain nucleic acids from Zea mays L. (maize).

BACKGROUND OF THE INVENTION

[0003] DNA methylation is often associated with transcriptional silencing of gene expression. The molecular mechanisms by which the signal of DNA methylation is transduced into a change in gene expression pattern are not defined. Evidence indicates that distinct mechanisms may be involved in the silencing of gene expression by DNA methylation (reviewed by Kass et al., Trends Genet., 13: 444-449 (1997)). In some examples, DNA methylation has been found to reduce the affinity of transcriptional activators for DNA regulatory sequences (reviewed by Eden and Cedar, Curr. Opin. Gen. Dev., 4:225-259 (1994)). However, in other cases, DNA methylation had no effect on binding affinities (Eden and Cedar, Curr. Opin. Gen. Dev., 4:225-259 (1994)). A second mechanism for the repression of gene expression by DNA methylation involves proteins that bind specifically to methylated DNA. These proteins may be transcriptional repressors or they may interfere with transcription by blocking access to DNA for transcriptional activators (reviewed by Bird and Wolffe, Cell, 99: 451-454 (1999)).

[0004] Two complexes with the ability to specifically bind to DNA containing 5-methylcytosine, MeCP1 and MeCP2, were identified in mammals (Meehan et al., Cell 58: 499-507 (1989); Meehan et al., Nucl. Acids Res., 20:5085-5092 (1992)). MeCP1 is a set of several proteins while MeCP2 is a single protein. Cloning of the MeCP2 protein revealed the presence of a domain of 60 amino acids, which is necessary and sufficient for binding to methylated DNA (Nan et al., Nucl. Acids Res., 21:4886-4892 (1993)). The identification of the methyl-CpG-binding domain (MBD) facilitated in silico searches for putative methyl-CpG binding proteins.

[0005] In mammals, five proteins containing an MBD have been identified: MeCP2, MBD1, MBD2 (part of the MeCP1 complex), MBD3 and MBD4 (Hendrich et al., Nature, 401: 301-304 (1999)). These proteins do not share any homology outside the MBD (except MBD2 and MBD3). MeCP2, MBD1, MBD2 and MBD4 all bind to methylated DNA. Although it contains a putative methyl-CpG-binding domain, MBD3 does not show the ability to specifically bind to methylated DNA (Hendrich, et al., Mol. Cell Bio., 18: 6538-6547 (1998)). Evidence indicates that the methyl-CpG-binding domain proteins isolated from mammals can perform three distinct biochemical functions: DNA repair, demethylation, and transcriptional repression.

[0006] Spontaneous deamination of cytosine results in the presence of a G-U mismatch. This mismatch is processed by uracil glycosylase, which results in the proper repair. Deamination of 5-methylcytosine yields thymine, which results in a G-T mismatch that is not repaired by uracil glycosylase. MBD4 (also referred to as MED1) contains an MBD in the N-terminal portion of the protein and a glycosylase domain in the C-terminal portion (Hendrich et al., Nature, 401: 301-304 (1999)). ^(m)CpG×TpG mismatches were found to be the preferred binding site of MBD4. Upon binding, MBD4 catalyzes the removal of the thymine, which will result in the proper repair of the mutation. The presence of MBD4 allows organisms to reduce the C→T transition rates induced by 5-methylcytosine. Many human carcinomas with microsatellite instabilities involve mutations in MBD4 (Riccio et al., Nat. Genet., 23: 266-268 (1999)).

[0007] Mammals contain a system that actively demethylates DNA (Wolffe et al., Proc. Nat. Acad. Sci. USA, 96: 5894-5896 (1999)). In one study, MBD2b was found to possess demethylase activities (Bhattacharya et al., Nature, 397: 579-583 (1999)). However, other studies failed to associate any demethylase activity with MBD2. Further biochemical studies are underway to determine if MBD2 is truly a demethylase enzyme or not. Another study found that MBD2 was a part of a MeCP1 complex (Ng et al., Nat. Genet., 23: 58-61 (1999), Wade et al., Nat. Genet., 23:62-66 (1999)). The MeCP1 complex is capable of repressing transcription from methylated promoters. This complex, which is present in somatic tissues but absent in embryonic stem (ES) cells, may play a role in developmental regulation of gene expression (Meehan et al., Cell, 58: 499-507 (1989)).

[0008] MBD1 and MeCP2 are both involved in active repression of transcription, but the exact mechanism of repression is distinct for the two proteins. MBD1 was originally proposed to be the component of the biochemically purified MeCP1 complex responsible for methyl-CpG-binding activity (Cross et al., Nat. Genet., 16: 256-259 (1997)). However, MBD1 antibodies do not detect any of the MeCP1 complex proteins in Western blots (Ng et al., Mol. Cell Bio., 20: 1394-1406 (2000)). MBD1 is involved in methylation-dependent transcriptional repression of the tumor suppressor genes p16, VHL and E-cadherin both in vitro and in vivo (Fujita et al., Mol. Cell Bio., 19: 6415-6426 (1999)). As is the case for MBD2, the transcriptional repression by MBD1 requires histone deacetylation (Ng et al., Mol. Cell Bio., 20:1394-1406 (2000)). However, the deacetylase-dependent pathway utilized by MBD1 is different from the pathways used by MBD2 and MeCP2 (Ng et al., Mol. Cell Bio., 20:1394-1406 (2000)).

[0009] The presence of a single 5-methylcytosine base is sufficient to induce binding of MeCP2. High densities of 5-methylcytosine will result in strong interactions with MeCP2 even when the DNA is packaged into nucleosomes (Nan et al., Cell, 88:471-481 (1997); Chandler et al., Biochemistry, 38: 7008-7018 (1999)). MeCP2 represses transcription via two distinct mechanisms. A transcriptional-repression domain (TRD) that interacts with mammalian Swi-Independent 3A (mSin3A) is found at amino acids 207-300 of MeCP2 (Nan et al., Nature, 393:386-389 (1998)). mSin3A is part of a corepressor complex containing HDAC1 and HDAC2. The histone deacetylation-dependent pathway utilized by MeCP2 is distinct from that utilized by MeCP1 (Ng et al., Mol. Cell Bio., 20:1394-1406 (2000)). MeCP2 can also repress transcription through a histone deacetylation-independent pathway (Yu et al., Nucl. Acids Res., 28:2201-2206 (2000)). Targeted deletion of MeCP2 in mouse ES cells does not affect ES cell proliferation, but embryos containing mutant ES cells fail to gastrulate and abort between embryonic day 8.5 and 12 (Tate et al., Nat. Genet., 12:205-208 (1996)). Mutations in the X-linked human MeCP2 gene result in Rett syndrome (Amir et al., Nat. Genet., 23:185-188 (1999)). Rett syndrome is one of the most common causes of mental retardation in females. Patients with Rett syndrome develop normally until 6-18 months of age and then display numerous phenotypes (Amir et al., Nat. Genet., 23: 185-188 (1999)). Mutations in MeCP2 are male inviable in humans.

[0010] Proteins containing a methyl-CpG-binding domain are important for many biological processes in mammals including DNA repair, transcriptional silencing and possibly demethylation. The MBD domain proteins are critical in mediating the effects of DNA methylation. In flowering plants, DNA methylation has been correlated with many epigenetic and developmental phenomena [R paramutation (reviewed by Kermicle, J., “Epigenetic silencing and activation of a maize r gene.” In: Russo V, Martienssen R, Riggs A. (eds.) Epigenetic Mechanisms of Gene Regulation, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp 267-287 (1996)), transposable element silencing (Chandler, et al., Proc. Natl. Acad. Sci. USA, 83:1767-1771 (1986), Banks et al., Genes and Dev., 2:1364-1380 (1988), Schwartz et al., Mol. Gen. Genet. 205, 476-482 (1986)), transgene silencing (reviewed by Matzke and Matzke, Plant Physiol., 107:679-685 (1995)), and Superman and Agamous regulation (Jacobsen et al., Curr. Biol., 10: 179-186 (2000))]. It is not known whether DNA methylation plays a causative role in any of these phenomena or if it is simply an effect of a change in epigenetic state. The effects of DNA methylation are likely mediated by plant MBD proteins in a manner similar to that of the MBD proteins in mammals. Therefore, there is a need in the art for the identification and characterization of Methyl CpG binding domains in plants.

SUMMARY OF THE INVENTION

[0011] In one embodiment, the present invention relates to an isolated and purified nucleic acid comprising a polynucleotide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 and conservatively modified and polymorphic variants thereof. In addition, the present invention relates to an isolated and purified nucleic acid comprising a polynucleotide having at least 60%, 70%, 80%, 90%, or 95% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.

[0012] In yet another embodiment, the present invention relates to an isolated and purified polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4 and conservatively modified variants thereof. In addition, the present invention relates to an isolated and purified polypeptide comprising an amino acid sequence having at least 60%, 70%, 80% or 95% identity to an amino acid sequence selected from the group consisting of: SEQ ID NO: 2 and SEQ ID NO: 4.

[0013] In yet a further embodiment, the present invention relates to an expression cassette containing a promoter sequence operably linked to an isolated and purified nucleic acid comprising a polynucleotide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 and conservatively modified and polymorphic variants thereof. Preferably, the expression cassette also contains a polyadenylation signal which is operably linked to the previously described nucleic acid. Examples of promoters which can be used in the expression cassette include constitutive and tissue specific promoters.

[0014] In yet another embodiment, the present invention relates to a bacterial cell containing the hereinbefore described expression cassette. The bacterial cell can be an Agrobacterium tumefaciens cell or an Agrobacterium rhizogenes cell.

[0015] In still yet another embodiment, the present invention relates to a plant cell transformed with the hereinbefore described expression cassette, a transformed plant containing such a plant cell, and to seed obtained from such a transformed plant. The plant cell, transformed plant and seed can be from Zea mays L.

BRIEF DESCRIPTION OF THE FIGURES

[0016] FIG. 1A shows the 1770 base pair ZmMBD1 nucleic acid amplified from one-week seedling cDNA. The putative start codon is indicated by a solid underline and the stop codon is indicated by a wavy underline. FIG. 1B shows the 433 amino acid ZmMBD1 protein. The proline/alanine rich repeats are highlighted in alternating solid and wavy underlines. A putative bipartite nuclear localization signal predicted by PSORT (http://psort.nibb.ac.jp/) is in bold text.

[0017] FIG. 2A shows the 1728 base pair nucleotide sequence of the ZmMBD2 nucleic acid, with the putative start codon indicated by a solid underline and the stop codon indicated by a wavy underline. FIG. 2B shows the translation of the coding region, which encodes a 428 amino acid protein. The proline/alanine rich repeats are underlined with alternating solid and wavy lines. The location of a putative bipartite nuclear localization signal predicted by PSORT (http://psort.nibb.ac.jp/) is in bold text.

[0018] FIG. 3 is a schematic diagram of MBD proteins. The ZmMBD and human methyl-CpG-binding proteins are indicated by white rectangles with the amino terminus on the left. The methyl-CpG-binding domain is shown by diagonal lines. The black rectangles indicate the region of proline/alanine rich repeats in the ZmMBD proteins. The location of a predicted coiled-coil domain is also indicated. All drawings are to scale (0.01 inch=1 amino acid).

[0019] FIG. 4 shows the alignment of ZmMBD1 and ZmMBD2 polypeptides. The amino acid sequences of the ZmMBD1 and ZmMBD2 polypeptides are aligned using CLUSTALW (http://dot.imgen.bcm.tmc.edu:9331/multi-align/Options/clustalw.html). The alignment is shaded using Boxshade (http://www.ch.embnet.org/software/BOX_form.html). Black shading indicates identical residues while gray shading indicates conservative substitutions. The proteins have substantial identity along their entire length. The first 100 amino acids are strongly conserved (96% identical).

[0020] FIG. 5 shows the alignment of proline/alanine rich repeats. The proline/alanine rich repeats of the ZmMBD proteins were aligned using CLUSTALW. FIG. 5A shows the alignments of ZmMBD1 repeats (underlined in FIG. 1B). The repeats were numbered 1-11 from the N-terminus towards the C-terminus. Black shading indicates identical residues while gray shading indicates similar residues. The consensus is indicated at the bottom with upper case letters representing completely conserved amino acids and lower case letters indicating the consensus amino acid. FIG. 5B shows the alignment of the ten proline/alanine-rich repeats of ZmMBD2 (underlined in FIG. 2B).

[0021] FIG. 6 shows methyl-CpG-binding domain alignments. FIG. 6A shows the methyl-CpG-binding domains of several mammalian proteins and the maize proteins (hMBD1-AAD50371, hMBD2-AAD56597, hMBD3-NP_003917, hMBD4-AAC68879, hMeCP2-CAA73190). Alignments were done using ClustalX and shading was done using Boxshade. Black shading indicates identical residues and similar residues are indicated by gray shading. FIG. 6B shows the alignment of methyl-CpG-binding domains with known structures and the ZmMBD proteins. The line above the sequences shows structural features, including beta sheets, loops, alpha helix, and a hairpin loop. In the sequences of MBD1 and MeCP2, amino acids that were identified as important for proper structure are indicated by bold text. The MBD1 amino acids that interact with DNA are underlined.

[0022] FIG. 7 shows the expression of ZmMBD1 and ZmMBD2. The expression pattern of ZmMBD1 and ZmMBD2 was tested by PCR amplification of cDNA. The source of the cDNA used in each reaction is indicated by the number above each lane and the key on the right. A set of control reactions using ubiquitin primers was also done. The gel showing the ubiquitin products is directly below the ZmMBD1 and ZmMBD2 gels. Roughly similar amounts of ubiquitin products were observed in all lanes except pollen. The absence of ubiquitin products from the pollen cDNA may indicate that this cDNA is degraded or at a much lower concentration than the other cDNAs. FIG. 7A shows that ZmMBD1 is expressed at high levels in all tissues. The expression in BMS cell culture tissue is slightly lower than in other tissues. FIG. 7B shows that ZmMBD2 is expressed at high levels in embryo, ear, immature tassel and root tissue but not in leaves or BMS cell cultures.

DEFINITIONS

[0023] Units, prefixes, and symbols can be denoted in the SI accepted form. Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5′ to 3′ orientation, respectively. The headings provided herein are not limitations of the various aspects or embodiments of the invention which can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the specification as a whole.

[0024] As used herein, the terms “amplify” and “amplified” are used interchangeably and refer to the construction of multiple copies of a nucleic acid sequence or multiple copies complementary to the nucleic acid sequence using at least one of the nucleic acid sequences as a template. Amplification methods include the polymerase chain reaction (hereinafter “PCR”; described in U.S. Pat. Nos. 4,683,195 and 4,683,202), the ligase chain reaction (hereinafter “LCR”; described in EP-A-320,308 and EP-A-439,182), the transcription-based amplification system (hereinafter “TAS”), nucleic acid sequence based amplification (hereinafter “NASBA”, Cangene, Mississauga, Ontario; described in Proc. Natl. Acad. Sci., USA, 87:1874-1878 (1990); Nature, 350 (No. 6313): 91-92 (1991)), Q-Beta Replicase systems, and strand displacement amplification (hereinafter “SDA”). The product of amplification is referred to as an amplicon.

[0025] As used herein, the term “antibody” includes reference to an immunoglobulin molecule obtained by in vitro or in vivo generation of a humoral response, and includes both polyclonal and monoclonal antibodies. The term also includes genetically engineered forms such as chimeric antibodies (e.g., humanized murine antibodies), heteroconjugate antibodies (e.g., bispecific antibodies), and recombinant single chain Fv fragments (hereinafter “scFv”). The term “antibody” also includes antigen binding forms of antibodies (e.g., Fab′, F(ab′)₂, Fab, Fv, and inverted IgG (See, Pierce Catalog and Handbook, (1994-1995) Pierce Chemical Co., Rockford, Ill.)). An antibody immunologically reactive with a particular antigen can be generated in vivo or by recombinant methods such as by the selection of libraries of recombinant antibodies in phage or similar vectors (See, e.g., Huse et al., Science, 246:1275-1281 (1989); Ward, et al., Nature, 341:544-546 (1989); and Vaughan et al., Nature Biotechnology, 14:309-314 (1996)).

[0026] As used herein, the term “antisense RNA” means an RNA sequence which is complementary to a sequence of bases in the mRNA in question in the sense that each base (or the majority of bases) in the antisense sequence (read in the 3′ to 5′ sense) is capable of pairing with the corresponding base (G with C, A with U) in the mRNA sequence read in the 5′ to 3′ sense.
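
By way of illustration only, the base-pairing relationship in this definition can be sketched in Python. The function names and the example sequence below are hypothetical and are not taken from SEQ ID NO: 1 or SEQ ID NO: 3; the sketch assumes simple Watson-Crick pairing (G with C, A with U) and ignores wobble pairing.

```python
# Watson-Crick pairing for RNA (G with C, A with U), as in the definition above.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna_5to3: str) -> str:
    """Return the fully complementary antisense RNA, written 5' to 3'."""
    mrna = mrna_5to3.upper()
    # Walk the mRNA from its 3' end and complement each base, so the
    # result is already in 5' to 3' orientation.
    return "".join(RNA_PAIR[base] for base in reversed(mrna))


def paired_fraction(mrna_5to3: str, antisense_5to3: str) -> float:
    """Fraction of mRNA bases that pair with the antisense read 3' to 5'."""
    if not mrna_5to3:
        return 0.0
    antisense_3to5 = antisense_5to3.upper()[::-1]
    pairs = zip(mrna_5to3.upper(), antisense_3to5)
    matches = sum(1 for m, a in pairs if RNA_PAIR.get(m) == a)
    return matches / len(mrna_5to3)


# Hypothetical example sequence, for illustration only.
example_mrna = "AUGGCCAAG"
example_antisense = antisense_rna(example_mrna)
```

A `paired_fraction` below 1.0 corresponds to the case in the definition where only the majority of bases, rather than every base, is capable of pairing.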

[0027] As used herein, the term “conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For example, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations” and represent one species of conservatively modified variation. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible “silent variation” of the nucleic acid. It is known by persons skilled in the art that each codon in a nucleic acid (except AUG, which is the only codon for the amino acid methionine; and UGG, which is the only codon for the amino acid tryptophan) can be modified to yield a functionally identical molecule. Therefore, each silent variation of a nucleic acid which encodes a polypeptide of the present invention is implicit in each described polypeptide sequence.
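
The degeneracy described above can be illustrated with a short Python sketch. It assumes the standard genetic code written in the RNA alphabet; the helper names and the three-residue peptide used in the usage note are hypothetical and serve only as an example.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases U, C, A, G at each
# position and the third base varying fastest; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(rna: str) -> str:
    """Translate an RNA coding sequence codon by codon (no stop handling)."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def count_encoding_sequences(peptide: str) -> int:
    """Number of distinct coding sequences that encode the given peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)
```

For instance, under these assumptions `count_encoding_sequences("MAW")` counts one codon for methionine, four for alanine and one for tryptophan, so the only silent variations of a sequence encoding that hypothetical peptide lie in the alanine codon.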

[0028] With respect to amino acid sequences, persons skilled in the art will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art.

[0029] The following six groups each contain amino acids that are conservative substitutions for one another:

[0030] 1) Alanine (A), Serine (S), Threonine (T);

[0031] 2) Aspartic acid (D), Glutamic acid (E);

[0032] 3) Asparagine (N), Glutamine (Q);

[0033] 4) Arginine (R), Lysine (K);

[0034] 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and

[0035] 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). See also, Creighton (1984) Proteins W. H. Freeman and Company.
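
As a minimal sketch, the six groups listed above can be encoded as a lookup for classifying a single residue change. The function name and the returned labels are hypothetical; only the group memberships come from the list above.

```python
# The six conservative substitution groups listed above, in one-letter code.
CONSERVATIVE_GROUPS = (
    frozenset("AST"),   # 1) Alanine, Serine, Threonine
    frozenset("DE"),    # 2) Aspartic acid, Glutamic acid
    frozenset("NQ"),    # 3) Asparagine, Glutamine
    frozenset("RK"),    # 4) Arginine, Lysine
    frozenset("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    frozenset("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
)


def classify_substitution(original: str, replacement: str) -> str:
    """Label a residue pair as identical, conservative or non-conservative."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return "identical"
    # A change is conservative only if both residues share one group.
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return "conservative"
    return "non-conservative"
```

Residues that appear in none of the six groups (for example glycine or proline) are treated here as non-conservative for any change, which is an assumption of this sketch rather than a statement of the specification.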

[0036] As used herein, the term “constitutive promoter” refers to a promoter which is active under most environmental conditions.

[0037] As used herein, the term “full length” when used in connection with a specified polynucleotide or encoded protein refers to having the entire amino acid sequence of a native (i.e., non-synthetic), endogenous, catalytically active form of the specified protein. Methods for determining whether a sequence is full length are well known in the art. Examples of such methods which can be used include Northern or Western blots, primer extension, etc. Additionally, comparison to known full-length homologous sequences can also be used to identify full length sequences of the present invention.

[0038] As used herein, the term “heterologous” when used to describe nucleic acids or polypeptides refers to nucleic acids or polypeptides that originate from a foreign species, or, if from the same species, are substantially modified from their original form. For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, is different from any naturally occurring allelic variants.

[0039] The term “immunologically reactive conditions” as used herein, includes reference to conditions which allow an antibody, generated to a particular epitope of an antigen, to bind to that epitope to a detectably greater degree than the antibody binds to substantially all other epitopes, generally at least two times above background binding, preferably at least five times above background. Immunologically reactive conditions are dependent upon the format of the antibody binding reaction and typically are those utilized in immunoassay protocols.

[0040] As used herein, the term “inducible promoter” refers to a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light.

[0041] As used herein, the term “isolated” includes reference to material which is substantially or essentially free from components which normally accompany or interact with it as found in its naturally occurring environment. The isolated material optionally comprises material not found with the material in its natural environment. However, if the material is in its natural environment, the material has been synthetically, (e.g. non-naturally) altered by deliberate human intervention to a composition and/or placed in a locus in a cell (e.g., genome or subcellular organelle) not native to a material found in that environment.

[0042] Two polynucleotides or polypeptides are said to be “identical” if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned (either manually for visual inspection or via the use of a computer algorithm or program) for maximum correspondence as described below. The terms “identical” or “percent identity” when used in the context of two or more polynucleotide or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. With respect to polypeptides or proteins having a “percent identity” or “percentage of sequence identity” one skilled in the art would recognize that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues possessing similar chemical and/or physical properties such as charge or hydrophobicity and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well-known to persons skilled in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity.
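
The adjustment for conservative substitutions described above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed method: the two sequences are assumed to be already aligned to equal length with “-” marking gaps, the groups are the six conservative substitution groups listed earlier in these definitions, and the partial credit of 0.5 used in the usage note is a hypothetical value.

```python
# Conservative substitution groups from the definitions above.
GROUPS = ("AST", "DE", "NQ", "RK", "ILMV", "FYW")


def percent_identity(aligned_a: str, aligned_b: str,
                     conservative_credit: float = 0.0) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    With conservative_credit > 0, a conservative substitution is scored
    as a partial rather than a full mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0.0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no residue in either sequence
        columns += 1
        if x == "-" or y == "-":
            continue  # a gap counts as a full mismatch
        if x == y:
            score += 1.0
        elif any(x in g and y in g for g in GROUPS):
            score += conservative_credit
    return 100.0 * score / columns if columns else 0.0
```

For the hypothetical aligned pair `"MKALV"` and `"MRALD"`, three of five positions are identical and one (K to R) is conservative, so the unadjusted value is 60% and the value with a credit of 0.5 is 70%.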

[0043] As used herein, the term “comparison window” includes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence may be compared to a reference sequence and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (e.g., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.

[0044] Generally, the comparison window is at least 20 contiguous nucleotides in length, and can be 30, 40, 50, 100, or even longer. Persons skilled in the art will recognize that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence a gap penalty is typically introduced and is subtracted from the number of matches.

[0045] The alignment of polynucleotide and/or polypeptide sequences for the purposes of determining sequence identity and similarity can be by either manual alignment and visual inspection or via the use of some type of computer program or algorithm. In fact, a number of computer programs which can be used to align polynucleotide and/or polypeptide sequences are known in the art. Examples include the programs available in the Wisconsin Sequence Analysis Package, Version 9 (available from the Genetics Computer Group, Madison, Wis., 52711), such as GAP, BESTFIT, FASTA and TFASTA. For example, the GAP program is capable of calculating both the identity and similarity between two polynucleotide or two polypeptide sequences. Specifically, the GAP program uses the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol., 48:443-453 (1970)). Another example of a useful computer program is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol. Evol. 35:351-360 (1987). Yet another example of a useful computer program that can be used for determining percent sequence identity and sequence similarity is the BLAST algorithm (Altschul et al., J. Mol. Biol., 215:403-410 (1990)). The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
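
For illustration, the kind of global alignment computed by the Needleman and Wunsch algorithm can be sketched in Python. This is a simplified sketch and not the GAP program itself: it assumes a single linear gap penalty and hypothetical scoring values (match +1, mismatch -1, gap -2), whereas the parameters used by any particular program may differ.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,  # align residues
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned by this sketch have equal length, so the number of identical columns can be counted directly to give a percent identity for the pair.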

[0046] With respect to polynucleotide sequences, the term “substantial identity” means that a polynucleotide comprises a sequence that has at least 60% sequence identity, preferably at least 70% sequence identity, more preferably at least 80% sequence identity, even more preferably at least 90% sequence identity and most preferably at least 95% sequence identity, compared to a reference sequence using one of the alignment programs described herein conducted according to standard parameters. One skilled in the art will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 60%, more preferably at least 70%, 80%, 90% identity, and most preferably at least 95% identity.

[0047] Polynucleotide sequences can also be considered to be substantially identical if two molecules hybridize to each other under stringent conditions. However, polynucleotides which do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This can occur when a copy of a polynucleotide is created using the maximum codon degeneracy permitted by the genetic code. One indication that two polynucleotide sequences are substantially identical is that the polypeptide encoded by the first polynucleotide is immunologically cross reactive with the polypeptide encoded by the second polynucleotide.

[0048] With peptides, the term “substantial identity” as used herein means that a peptide comprises a sequence having at least 60% sequence identity to a reference sequence, preferably 70% sequence identity, more preferably 80% sequence identity, even more preferably 90% sequence identity, and most preferably at least 95% sequence identity to the reference sequence over a specified comparison window. Preferably, optimal alignment is conducted using the homology alignment algorithm (GAP program discussed previously) of Needleman and Wunsch, J. Mol. Biol., 48: 443-453 (1970). An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide where the two peptides differ only by a conservative substitution. Peptides which are “substantially similar” share sequences as described above except that any residue positions which are not identical differ only by conservative amino acid changes.

[0049] As used herein, the term “nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids).

[0050] As used herein, the term “nucleotide(s)” refers to a macromolecule containing a sugar (either a ribose or deoxyribose), a phosphate group and a nitrogenous base.

[0051] As used herein, the term “operably linked” includes reference to a functional linkage between a promoter and a second sequence, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to the second sequence. Generally, operably linked means that the polynucleotide sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.

[0052] As used herein, the term “plant” includes reference to whole plants, plant organs (e.g., leaves, stems, flowers, roots, etc.), seeds and plant cells and progeny of the same. Plant cell, as used herein, includes, but is not limited to, suspension cultures, embryos, meristematic regions, callus tissue, shoots, gametophytes, sporophytes, pollen and microspores. The class of plants which can be used in the methods of the present invention is generally as broad as the class of higher plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants) as well as gymnosperms (e.g., Coniferophyta (conifers), Cycadophyta (cycads), Ginkgophyta (maidenhair tree) and Gnetophyta (gnetophytes)). The term “plant” as used herein also includes plants of a variety of ploidy levels, such as polyploid, diploid, haploid and hemizygous.

[0053] As used herein, the term “plant promoter” refers to a promoter capable of initiating transcription in plant cells.

[0054] As used herein, the term “polymorphic variant” in connection with a polynucleotide sequence refers to a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants may also encompass “single nucleotide polymorphisms” (SNPs) in which the polynucleotide sequence varies by one base. The presence of SNPs may be indicative of a certain population for a disease state or propensity for a disease state.

[0055] As used herein, the term “polynucleotide” refers to a deoxyribopolynucleotide, ribopolynucleotide, or analogs thereof that have the essential nature of a natural ribonucleotide in that they hybridize, under stringent hybridization conditions, to substantially the same nucleotide sequence as naturally occurring nucleotides and/or allow translation into the same amino acid(s) as the naturally occurring nucleotide(s). A polynucleotide can be full length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are “polynucleotides” as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, are polynucleotides as the term is used herein. Moreover, as used herein, the term polynucleotide includes such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including, but not limited to, simple and complex cells.

[0056] As used herein, the terms “polypeptide”, “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms “polypeptide”, “peptide” and “protein” are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.

[0057] As used herein, the term “promoter” refers to a region of DNA upstream from the start of transcription and involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. A promoter can optionally include distal enhancers or repressor elements which can be located several thousand base pairs from the start site of transcription.

[0058] As used herein, the term “recombinant” includes reference to a cell, or nucleic acid, or vector, that has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid to a form not native to that cell, or that the cell is derived from a cell so modified. For example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.

[0059] As used herein, the term “recombinant expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements which permit transcription of a particular nucleic acid in a target cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of the expression vector includes a nucleic acid to be transcribed, and a promoter.

[0060] As used herein, the terms “residue” or “amino acid” or “amino acid residue” are used interchangeably herein to refer to an amino acid that is incorporated into a protein, polypeptide or peptide. The amino acid may be a naturally occurring amino acid, and unless otherwise limited, may encompass known analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.

[0061] As used herein, the terms “selective hybridization” and “selectively hybridizes” are used interchangeably and include reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids. Selectively hybridizing sequences typically have at least about 80% sequence identity, preferably 90% sequence identity, and most preferably 100% sequence identity (e.g., complementary) with each other.

[0062] As used herein, the term “specifically binds” includes reference to the preferential association of a ligand, in whole or part, with a particular target molecule (i.e., “binding partner” or “binding moiety”) relative to compositions lacking that target molecule. It is, of course, recognized that a certain degree of non-specific interaction may occur between a ligand and a non-target molecule. Nevertheless, specific binding may be distinguished as mediated through specific recognition of the target molecule. Typically, specific binding results in a much stronger association between the ligand and the target molecule than between the ligand and non-target molecule. Specific binding by an antibody to a protein under such conditions requires an antibody that is selected for its specificity for a particular protein. The affinity constant of the antibody binding site for its cognate monovalent antigen is at least 10⁷, usually at least 10⁹, more preferably at least 10¹⁰, and most preferably at least 10¹¹ liters/mole.

[0063] As used herein, the terms “stringent hybridization conditions” or “stringent conditions” refer to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acid, but to no other sequences. Stringent conditions are sequence dependent and are different under different environmental parameters. An extensive guide to hybridization of nucleic acids is found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes Part 1, Chapter 2 “Overview of Principles of Hybridization and the Strategy of Nucleic Acid Probe Assays” Elsevier, N.Y. (1993). Generally, highly stringent conditions are selected to be about 5° C. to 10° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. The T_m is the temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the target sequence hybridizes to a perfectly matched probe. Stringent conditions are those in which the salt concentration is less than about 1.0M sodium ion, typically about 0.01 to 1.0M sodium ion concentration (or other salts) at a pH of 7.0 to 8.3 and at a temperature of at least about 30° C. for short probes (such as those having a length between about 10 and 50 nucleotides) and at least about 60° C. for long probes (such as those having a length greater than 50 nucleotides). In contrast, low stringency conditions are at about 15-30° C. below the T_m. Stringent hybridization conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize at higher temperatures.
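
The relationship between the thermal melting point and the stringency ranges described above can be illustrated numerically. The following is a minimal sketch only. The empirical formula used to estimate T_m (81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(%formamide) - 500/L, a commonly cited approximation for DNA-DNA hybrids) is not part of this application and is assumed here purely for illustration, as are the function names and the example probe.

```python
import math


def estimate_tm(seq, sodium_molar=1.0, formamide_percent=0.0):
    """Approximate T_m in deg C for a DNA-DNA hybrid.

    Assumption: uses a commonly cited empirical formula that is not
    taken from this application.
    """
    seq = seq.upper()
    length = len(seq)
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)


def temperature_window(tm, min_below, max_below):
    """Return (lowest, highest) temperature lying min_below..max_below under T_m."""
    return (tm - max_below, tm - min_below)


# Hypothetical 100-nucleotide probe with 50% GC content.
probe = "ATGC" * 25
tm = estimate_tm(probe, sodium_molar=1.0)

high_stringency = temperature_window(tm, 5, 10)   # about 5-10 deg C below T_m
low_stringency = temperature_window(tm, 15, 30)   # about 15-30 deg C below T_m
```

Under these assumptions, a lower sodium concentration or added formamide lowers the estimated T_m, which in turn shifts both temperature windows downward.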

[0064] As used herein, the term “tissue-specific promoter” includes reference to a promoter in which expression of an operably linked gene is limited to a particular tissue or tissues.

[0065] As used herein, the term “transgenic plant” includes reference to a plant modified by introduction of a heterologous polynucleotide. Generally, the heterologous polynucleotide is a ZmMBD1 or ZmMBD2 polynucleotide or subsequences thereof.

[0066] As used herein, the term “ZmMBD1 gene” refers to a gene of the present invention, specifically, the heterologous genomic form of a full length ZmMBD1 polynucleotide.

[0067] As used herein, the term “ZmMBD1 nucleic acid” refers to a nucleic acid of the present invention, specifically, a nucleic acid comprising a polynucleotide of the present invention encoding a ZmMBD1 polypeptide (hereinafter “ZmMBD1 polynucleotide”). An example of a ZmMBD1 polynucleotide (cDNA) is shown in SEQ ID NO: 1.

[0068] As used herein, the terms “ZmMBD1 polypeptide”, “ZmMBD1 peptide” or “ZmMBD1 protein” as used interchangeably herein refer to a polypeptide shown in SEQ ID NO: 2. The term also includes fragments, variants, homologs, alleles or precursors (e.g., preproproteins or proproteins) thereof.

[0069] As used herein, the term “ZmMBD2 gene” refers to a gene of the present invention, specifically, the heterologous genomic form of a full length ZmMBD2 polynucleotide.

[0070] As used herein, the term “ZmMBD2 nucleic acid” refers to a nucleic acid of the present invention, specifically, a nucleic acid comprising a polynucleotide of the present invention encoding a ZmMBD2 polypeptide (hereinafter a “ZmMBD2 polynucleotide”). An example of a ZmMBD2 polynucleotide (cDNA) is shown in SEQ ID NO: 3.

[0071] As used herein, the terms “ZmMBD2 polypeptide”, “ZmMBD2 peptide” or “ZmMBD2 protein” as used interchangeably herein refer to a polypeptide shown in SEQ ID NO: 4. The term also includes fragments, variants, homologs, alleles or precursors (e.g., preproproteins or proproteins) thereof. A “ZmMBD2 protein” is a protein of the present invention and comprises a ZmMBD2 polypeptide.

Sequence Listings

[0072] The present application also contains a sequence listing that contains thirteen (13) sequences. The sequence listing contains nucleotide sequences and amino acid sequences. For the nucleotide sequences, the base pairs are represented by the following base codes:

Symbol    Meaning
A         A; adenine
C         C; cytosine
G         G; guanine
T         T; thymine
U         U; uracil
M         A or C
R         A or G
W         A or T/U
S         C or G
Y         C or T/U
K         G or T/U
V         A or C or G; not T/U
H         A or C or T/U; not G
D         A or G or T/U; not C
B         C or G or T/U; not A
N         (A or C or G or T/U)
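
As an illustration of how the base codes listed above can be applied, the following minimal sketch expands a pattern written with those codes into a regular expression that matches every concrete sequence the pattern allows. The dictionary simply restates the base codes. The function name and the example pattern are hypothetical.

```python
import re

# Base codes as listed above; each symbol maps to the bases it may represent.
BASE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "M": "AC", "R": "AG", "W": "ATU", "S": "CG", "Y": "CTU",
    "K": "GTU", "V": "ACG", "H": "ACTU", "D": "AGTU", "B": "CGTU",
    "N": "ACGTU",
}


def base_code_regex(pattern):
    """Compile a pattern of base codes into a regular expression."""
    classes = ["[" + BASE_CODES[symbol] + "]" for symbol in pattern.upper()]
    return re.compile("".join(classes))


# Hypothetical pattern: G, then a purine (R), then a pyrimidine (Y).
gry = base_code_regex("GRY")
matches_gac = gry.fullmatch("GAC") is not None
matches_gcc = gry.fullmatch("GCC") is not None
```

In this example "GAC" is accepted because A is a purine and C is a pyrimidine, whereas "GCC" is rejected because C is not covered by R.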

[0073] The amino acids shown in the application are in the L-form and are represented by the following amino acid three-letter abbreviations:

Abbreviation    Amino acid name
Ala             L-Alanine
Arg             L-Arginine
Asn             L-Asparagine
Asp             L-Aspartic Acid
Asx             L-Aspartic Acid or Asparagine
Cys             L-Cysteine
Glu             L-Glutamic Acid
Gln             L-Glutamine
Glx             L-Glutamine or Glutamic Acid
Gly             L-Glycine
His             L-Histidine
Ile             L-Isoleucine
Leu             L-Leucine
Lys             L-Lysine
Met             L-Methionine
Phe             L-Phenylalanine
Pro             L-Proline
Ser             L-Serine
Thr             L-Threonine
Trp             L-Tryptophan
Tyr             L-Tyrosine
Val             L-Valine
Xaa             L-Unknown or other

[0074] Introduction

[0075] The present invention is based, at least in part, on the discovery and cloning of two (2) methyl CpG binding domain nucleic acids from Zea mays L. (maize) termed the ZmMBD1 nucleic acid and the ZmMBD2 nucleic acid, respectively.

[0076] The present invention is applicable to a broad range of types of plants, including, but not limited to, Zea mays L., Oryza sativa, Secale cereale, Triticum aestivum, Daucus carota, Brassica oleracea, Cucumis melo, Cucumis sativus, Lactuca sativa, Solanum tuberosum, Lycopersicon esculentum, Phaseolus vulgaris, and Brassica napus.

[0077] Nucleic Acids

[0078] In one embodiment, the present invention relates to isolated nucleic acids of DNA, RNA, and analogs and/or chimeras thereof, comprising a polynucleotide, wherein said polynucleotide is a ZmMBD1 or ZmMBD2 polynucleotide which encodes a polypeptide of SEQ ID NO: 2 (a ZmMBD1 polypeptide) or SEQ ID NO: 4 (a ZmMBD2 polypeptide), and conservatively modified variants thereof. It is known in the art that the degeneracy of the genetic code allows for a plurality of polynucleotides to encode the identical amino acid sequence. These “silent variations”, as they are commonly referred to, can be used to selectively hybridize and detect polymorphic variants of the polynucleotides of the present invention.
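
The degeneracy of the genetic code referred to above can be illustrated with a short sketch. It builds the standard genetic code table and shows two different polynucleotides that encode the same amino acid sequence. The two example sequences are hypothetical and are not taken from SEQ ID NO: 1 or SEQ ID NO: 3.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical coding sequences that differ at four nucleotide positions.
original = "ATGGCTCTTAAA"
silent_variant = "ATGGCCTTGAAG"

differences = sum(1 for a, b in zip(original, silent_variant) if a != b)
same_protein = translate(original) == translate(silent_variant)
```

Both sequences translate to the same four-residue peptide even though they differ at the nucleotide level, which is what is meant by a silent variation.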

[0079] An example of a ZmMBD1 polynucleotide which encodes the ZmMBD1 polypeptide of SEQ ID NO: 2 is shown in SEQ ID NO: 1. The polynucleotide of SEQ ID NO: 1 is 1770 base pairs in length.

[0080] An example of a ZmMBD2 polynucleotide which encodes the ZmMBD2 polypeptide of SEQ ID NO: 4 is shown in SEQ ID NO: 3. The polynucleotide of SEQ ID NO: 3 is 1728 base pairs in length.

[0081] In another embodiment, the present invention also provides isolated nucleic acids comprising polynucleotides encoding conservatively modified variants of the ZmMBD1 or ZmMBD2 polypeptides of SEQ ID NOS: 2 and 4. Such conservatively modified variants can be used for a number of useful purposes, such as, but not limited to, the generation or selection of antibodies immunoreactive to the non-variant polypeptide. Also, in yet another embodiment, the present invention also relates to isolated nucleic acids comprising polynucleotides encoding one or more polymorphic variants of polynucleotides. Polymorphic variants are used to follow the segregation of chromosome regions and are typically used in marker assisted selection methods for crop improvement.

[0082] In another embodiment, the present invention relates to the isolation of nucleic acids comprising polynucleotides of the present invention which selectively hybridize, under selective hybridization conditions (i.e., stringent hybridization conditions), to the ZmMBD1 or ZmMBD2 polynucleotide. The isolation of such nucleic acids can be accomplished by a number of techniques. For example, oligonucleotide probes based upon the ZmMBD1 and ZmMBD2 polynucleotides described herein can be used to identify, isolate or amplify partial or full length clones in a deposited library (such as a cDNA or genomic DNA library). For example, a cDNA or genomic library can be screened using a probe based upon the sequence of the ZmMBD1 or ZmMBD2 polynucleotides described herein. These probes can be used to hybridize with genomic DNA or cDNA sequences to isolate homologous genes in the same or different plant species.

[0083] Alternatively, nucleic acids of interest can be amplified from nucleic acid samples using various amplification techniques known in the art. For example, PCR can be used to amplify the sequences of the ZmMBD1 or ZmMBD2 genes directly from genomic DNA, from cDNA, from genomic libraries or cDNA libraries. PCR and other in vitro amplification methods (such as LCR, etc.) can be used to clone nucleic acid sequences that code for proteins to be expressed, to make nucleic acids for use as probes for detecting the presence of the desired mRNA in samples, for nucleic acid sequencing or for other purposes.

[0084] In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides, wherein the polynucleotides of said nucleic acids have a specified identity at the nucleotide level to the previously described ZmMBD1 or ZmMBD2 polynucleotides. The percentage of identity is at least 60%, preferably 70%, more preferably 80%, even more preferably 90% and most preferably 95%.
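
A percentage of identity of the kind specified above can be computed from a global alignment. The following is a minimal sketch of a Needleman-Wunsch style alignment followed by an identity calculation. The scoring values (match 1, mismatch -1, gap -1), the choice to divide by the full alignment length, and the example sequences are assumptions made for illustration and are not the parameters of the GAP program.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return both aligned strings ('-' marks gaps)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


# Hypothetical sequences differing at a single position.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
meets_minimum = identity >= 60.0
```

The resulting percentage can then be compared against whichever of the thresholds above applies (60%, 70%, 80%, 90% or 95%).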

[0085] In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides complementary to the previously described ZmMBD1 or ZmMBD2 polynucleotides. One skilled in the art will recognize that complementary sequences will base pair throughout their entire length with the previously described ZmMBD1 or ZmMBD2 polynucleotides (meaning that they have 100% sequence identity over their entire length). Complementary bases associate through hydrogen bonding in double stranded nucleic acids. Base pairs known to be complementary include the following: adenine and thymine, guanine and cytosine and adenine and uracil.
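
The base pairing rules listed above can be expressed as a short sketch that derives the complementary strand of a DNA sequence and checks whether two strands pair along their entire length. The function names and example sequence are hypothetical.

```python
# Watson-Crick pairs for DNA: adenine-thymine and guanine-cytosine.
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def pairs_over_entire_length(strand, other):
    """True if 'other' base pairs with 'strand' at every position."""
    return len(strand) == len(other) and reverse_complement(strand) == other


# Hypothetical six-base example.
strand = "ATGCCG"
complement = reverse_complement(strand)

# An RNA complement pairs adenine with uracil instead of thymine.
rna_complement = complement.replace("T", "U")
```

Applying `reverse_complement` twice returns the original strand, which is a convenient sanity check when handling probe sequences.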

[0086] In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides which comprise at least 15 contiguous bases from the previously described ZmMBD1 or ZmMBD2 polynucleotides. More specifically, the length of the polynucleotides can be from about 15 contiguous bases to the length of the ZmMBD1 or ZmMBD2 polynucleotide of which the polynucleotide is a subsequence. For example, such polynucleotides can be 15, 35, 55, 75, 95, 100, 200, 400, 500, 750, etc. contiguous nucleotides in length from the previously described ZmMBD1 or ZmMBD2 polynucleotide. In addition, such subsequences can optionally comprise or lack certain structural characteristics from the ZmMBD1 or ZmMBD2 polynucleotides from which they are derived.

[0087] Polypeptides

[0088] In one embodiment, the present invention relates to a ZmMBD1 polypeptide of SEQ ID NO: 2. The ZmMBD1 polypeptide is 433 amino acids in length.

[0089] In a second embodiment, the present invention relates to a ZmMBD2 polypeptide of SEQ ID NO: 4. The ZmMBD2 polypeptide is 428 amino acids in length.

[0090] In another embodiment, the present invention relates to a polypeptide having a specified percentage of sequence identity with the ZmMBD1 or ZmMBD2 polypeptide of the present invention. The percentage of sequence identity is at least 60%, preferably 70%, more preferably 80%, even more preferably 90% and most preferably 95%.

[0091] The present invention also provides antibodies which specifically react with the ZmMBD1 or ZmMBD2 polypeptides of the present invention under immunologically reactive conditions. An antibody immunologically reactive with a particular antigen can be generated in vivo or by recombinant methods such as by selection of libraries of recombinant antibodies in phage or similar vectors.

[0092] Many methods of making antibodies are known to persons skilled in the art. A number of immunogens can be used to produce antibodies specifically reactive to the isolated ZmMBD1 or ZmMBD2 polypeptides of the present invention under immunologically reactive conditions. An isolated recombinant, synthetic, or native ZmMBD1 or ZmMBD2 polypeptide of the present invention is the preferred immunogen (antigen) for the production of monoclonal or polyclonal antibodies.

[0093] The ZmMBD1 or ZmMBD2 polypeptide can be injected into an animal capable of producing antibodies. Either monoclonal or polyclonal antibodies can be generated for subsequent use in immunoassays to measure the presence and quantity of the ZmMBD1 or ZmMBD2 polypeptide. Methods of producing monoclonal or polyclonal antibodies are known to persons skilled in the art (See, Coligan, Current Protocols in Immunology, Wiley/Greene, N.Y. (1991); Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Press, N.Y. (1989); and Goding, Monoclonal Antibodies: Principles and Practice (2d ed.), Academic Press, New York, N.Y. (1986)).

[0094] The ZmMBD1 or ZmMBD2 polypeptides and antibodies can be labeled by joining, either covalently or non-covalently, a substance which provides for a detectable signal. A wide variety of labels and conjugation techniques are known to persons skilled in the art. Suitable labels include radionucleotides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, chemiluminescent moieties, magnetic particles, and the like. Patents teaching the use of such labels include U.S. Pat. Nos. 3,817,837, 3,850,752, 3,939,350, 3,996,345, 4,277,437, 4,275,149, and 4,366,241.

[0095] The antibodies of the present invention can be used to screen plants for the expression of the ZmMBD1 or ZmMBD2 polypeptides of the present invention. The antibodies of the present invention can also be used for affinity chromatography for the purpose of isolating ZmMBD1 or ZmMBD2 polypeptides.

[0096] The present invention further provides ZmMBD1 or ZmMBD2 polypeptides that specifically bind, under immunologically reactive conditions, to an antibody generated against a defined immunogen, such as an immunogen consisting of the ZmMBD1 or ZmMBD2 polypeptides. Immunogens will generally have a length of at least 10 contiguous amino acids from the ZmMBD1 or ZmMBD2 polypeptides of the present invention, respectively.

[0097] A variety of immunoassay formats are appropriate for selecting antibodies specifically reactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically reactive with a protein (See Harlow and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York (1988), for a description of immunoassay formats and conditions that can be used to determine specific reactivity). The antibody may be polyclonal but preferably is monoclonal. Generally, antibodies cross-reactive to ZmMBD1 or ZmMBD2 polypeptides are removed by immunoabsorption.

[0098] Immunoassays in the competitive binding format are typically used for cross-reactivity determinations. For example, an immunogenic ZmMBD1 or ZmMBD2 polypeptide can be immobilized to a solid support. Polypeptides added to the assay compete with the binding of the antisera to the immobilized antigen. The ability of the above polypeptides to compete with the binding of the antisera to the immobilized ZmMBD1 or ZmMBD2 polypeptide is compared to the immunogenic ZmMBD1 or ZmMBD2 polypeptide. The percent cross-reactivity for the above proteins is calculated, using standard calculations known to persons skilled in the art.

[0099] The immunoabsorbed and pooled antisera are then used in a competitive binding immunoassay to compare a second “target” polypeptide to the immunogenic polypeptide. In order to make this comparison, the two polypeptides are each assayed at a wide range of concentrations and the amount of each polypeptide required to inhibit 50% of the binding of the antisera to the immobilized protein is determined using standard techniques. If the amount of the target polypeptide required is less than twice the amount of the immunogenic polypeptide that is required, then the target polypeptide is said to specifically bind to an antibody generated to the immunogenic protein. As a final determination of specificity, the pooled antisera are fully immunoabsorbed with the immunogenic polypeptide until no binding to the polypeptide used in the immunoabsorption is detectable. The fully immunoabsorbed antisera are then tested for reactivity with the test polypeptide. If no reactivity is observed, then the test polypeptide is specifically bound by the antisera elicited by the immunogenic protein.
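
The comparison described above can be sketched as a small calculation. The code estimates, for each polypeptide, the amount required to inhibit 50% of antisera binding and then applies the less-than-twice criterion. The interpolation on a log concentration scale, the percent cross-reactivity formula (immunogen amount divided by target amount, times 100) and all data values are assumptions for illustration and are not prescribed by this application.

```python
import math


def amount_for_half_inhibition(concentrations, percent_binding):
    """Estimate the concentration at which binding falls to 50%.

    Assumes concentrations are ascending and binding decreases with
    concentration; interpolates linearly on a log10 concentration scale.
    """
    points = list(zip(concentrations, percent_binding))
    for (c0, b0), (c1, b1) in zip(points, points[1:]):
        if b0 >= 50.0 >= b1 and b0 != b1:
            fraction = (b0 - 50.0) / (b0 - b1)
            log_c = math.log10(c0) + fraction * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("binding does not cross 50% in the tested range")


def specifically_binds(target_amount, immunogen_amount):
    """Criterion from the text: target needs less than twice the immunogen amount."""
    return target_amount < 2 * immunogen_amount


def percent_cross_reactivity(target_amount, immunogen_amount):
    """Assumed convention: ratio of 50% inhibition amounts, as a percentage."""
    return 100.0 * immunogen_amount / target_amount


# Hypothetical competition data (arbitrary concentration units, % binding).
doses = [1.0, 10.0, 100.0]
immunogen_50 = amount_for_half_inhibition(doses, [90.0, 50.0, 10.0])
target_50 = amount_for_half_inhibition(doses, [95.0, 75.0, 25.0])

target_is_specific = specifically_binds(target_50, immunogen_50)
```

With these made-up values the target polypeptide requires roughly three times the amount of the immunogenic polypeptide, so it would not meet the less-than-twice criterion.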

[0100] Production of Recombinant Expression Cassettes

[0101] Isolated nucleic acids of the present invention can be used in recombinant expression cassettes. One of ordinary skill in the art will recognize that a nucleic acid used in the recombinant expression cassettes described herein encoding a functional ZmMBD1 or ZmMBD2 polypeptide need not have a sequence identical to the exemplified nucleic acids disclosed herein and does not need to be full length, so long as the desired functional domain of the ZmMBD1 or ZmMBD2 protein is expressed.

[0102] A nucleic acid comprising a polynucleotide coding for the desired functional ZmMBD1 or ZmMBD2 polypeptide, for example a cDNA or a genomic sequence encoding a full length protein, can be used to construct a recombinant expression cassette which can be introduced into a desired plant. An expression cassette will typically comprise the functional ZmMBD1 or ZmMBD2 nucleic acid operably linked in either the sense or antisense direction to transcriptional and translational initiation regulatory sequences which will direct the transcription of the sequence from the functional ZmMBD1 or ZmMBD2 nucleic acid in the intended tissues for the transformed plant. Examples of transcriptional and translational initiation regions that can be used in the recombinant expression cassette are well known in the art.

[0103] The recombinant expression cassette will contain a promoter which is used to direct expression of the polynucleotides of the present invention in one, more than one, or in all of the tissues of a regenerated plant. For example, a constitutive plant promoter may be employed which will direct expression of the functional ZmMBD1 or ZmMBD2 polypeptide in all tissues of a regenerated plant. Examples of constitutive promoters include, but are not limited to, the cauliflower mosaic virus (hereinafter “CaMV”) 35S transcription initiation region, the NOS promoter, the RUBISCO promoter, the 1′ or 2′ promoter derived from T-DNA of Agrobacterium tumefaciens, etc. A suitable constitutive plant promoter to be used in the recombinant expression cassette can readily be determined by persons skilled in the art.

[0104] Alternatively, an inducible plant promoter can be used. An inducible plant promoter may direct expression of the ZmMBD1 or ZmMBD2 nucleic acid in specific tissue or under more precise environmental or developmental control in a regenerated plant. Examples of environmental conditions that may affect transcription by inducible promoters include pathogen attack, anaerobic conditions, or the presence of light. Examples of inducible promoters include, but are not limited to, the Hsp70 promoter (which is inducible by heat stress), the PPDK promoter (which is inducible by light), etc.

[0105] Promoters derived from the ZmMBD1 or ZmMBD2 genes can be used to direct expression. These promoters can also be used to direct expression of heterologous sequences. The promoters can be used, for example, in recombinant expression cassettes to drive expression of the ZmMBD1 or ZmMBD2 nucleic acids of the present invention or heterologous sequences.

[0106] Such promoters can be identified as follows. The 5′ portions of the ZmMBD1 or ZmMBD2 genes are analyzed for sequences characteristic of promoter sequences. For instance, promoter sequence elements include the TATA box consensus sequence (TATAAT), which is usually 20 to 30 base pairs upstream of the transcription start site. In plants, further upstream from the TATA box, at positions −80 to −100, there is typically a promoter element with a series of adenines surrounding the trinucleotide G (or T) N G. (See, J. Messing et al., in Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and Hollaender, eds. 1983)).
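
The analysis of a 5′ region for the TATA box consensus sequence can be sketched as a simple motif scan. The example below reports occurrences of TATAAT whose first base lies 20 to 30 base pairs upstream of the transcription start site. Measuring the distance from the first base of the motif, the function names, and the example sequence are assumptions for illustration. A real promoter analysis would normally tolerate mismatches to the consensus.

```python
def find_motif(sequence, motif):
    """Return the 0-based start index of every exact occurrence of motif."""
    sequence = sequence.upper()
    motif = motif.upper()
    last_start = len(sequence) - len(motif)
    return [i for i in range(last_start + 1)
            if sequence[i:i + len(motif)] == motif]


def tata_box_candidates(upstream, motif="TATAAT", window=(20, 30)):
    """Find motif hits within the given distance window from the start site.

    'upstream' is assumed to end immediately 5' of the transcription start
    site, so its last base is position -1. Returns (index, position) pairs,
    where position is the negative coordinate of the motif's first base.
    """
    hits = []
    for index in find_motif(upstream, motif):
        distance = len(upstream) - index
        if window[0] <= distance <= window[1]:
            hits.append((index, -distance))
    return hits


# Hypothetical 46-base upstream region with one TATAAT element.
upstream_region = "GC" * 10 + "TATAAT" + "GC" * 10
candidates = tata_box_candidates(upstream_region)
```

In this made-up region the motif begins 26 bases upstream of the start site and is therefore reported as a candidate.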

[0107] If proper polypeptide expression is desired, a polyadenylation region at the 3′-end of the ZmMBD1 or ZmMBD2 polynucleotide coding region should be included. The polyadenylation region can be derived from a natural gene, from a variety of other plant genes, or from T-DNA. For example, polyadenylation regions can be derived from the nopaline synthase or octopine synthase genes.

[0108] The expression cassette comprising the ZmMBD1 or ZmMBD2 nucleic acids will typically comprise one or more marker genes which confer a selectable phenotype on plant cells. For example, the marker gene can encode biocide resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418, bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorsulfuron.

[0109] As discussed briefly above, the ZmMBD1 or ZmMBD2 nucleic acids can be inserted into a recombinant expression cassette in the antisense direction. Expression of the ZmMBD1 or ZmMBD2 nucleic acid in antisense direction will result in the production of antisense RNA. It is well known to persons skilled in the art that a cell manufactures protein by transcribing the DNA of the gene encoding a protein to produce RNA, which is then processed to messenger RNA (hereinafter “mRNA”) (e.g., by the removal of introns) and finally translated by ribosomes into protein. This process may be inhibited in the cell by the presence of antisense RNA. It is believed that this inhibition takes place by formation of a complex between the two complementary strands of RNA, thus preventing the formation of protein. It is presently unclear how this mechanism works. However, it is believed that the complex may interfere with further translation, degrade the mRNA, or have more than one of these effects. This antisense RNA can be produced in the cell by transformation of the cell with an appropriate recombinant expression cassette designed to transcribe the non-template strand (as opposed to the template strand) of the relevant gene (or of a nucleic acid sequence showing substantial identity therewith).

[0110] The use of antisense RNA to downregulate the expression of specific plant genes is well known. Reduction in gene expression has been shown to lead to changes in the phenotype of a plant, either at the level of gross visible phenotypic difference (see van der Krol et al., Nature, 333:866-869 (1988)), or at a more subtle biochemical level (Smith et al., Nature, 334:724-726 (1988)). Another method for inhibiting gene expression in transgenic plants involves the use of sense RNA transcribed from an exogenous template to downregulate the expression of specific plant genes (See, Jorgensen, Keystone Symposium “Improved Crop and Plant Products through Biotechnology”, Abstract X1-022 (1994)). Thus, both antisense and sense RNA can be used to achieve downregulation of gene expression in plants, and both approaches are encompassed by the present invention.

[0111] Production of Transgenic Plants

[0112] Techniques for transforming a wide variety of higher plant species using the recombinant expression cassettes hereinbefore described are well known and described in the technical and scientific literature (See, for example, Weising et al., Ann. Rev. Genet., 22:421-477 (1988)).

[0113] The hereinbefore described recombinant expression cassettes can be introduced into the genome of a desired plant host by a variety of conventional techniques which are well known to persons skilled in the art. For example, the recombinant expression cassette can be introduced directly into the genomic DNA of the plant cell using techniques such as electroporation, PEG poration, particle bombardment, silicon fiber delivery, and microinjection of plant cell protoplasts or embryogenic callus, or the expression cassettes can be introduced directly to plant tissue using ballistic methods, such as DNA particle bombardment. Alternatively, the expression cassettes may be combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens or Agrobacterium rhizogenes host vector. The virulence functions of the Agrobacterium host will direct the insertion of the expression cassette and adjacent marker gene into the plant cell DNA when the cell is infected by the bacteria.

[0114] Plants which can be transformed with the recombinant expression cassette of the present invention include, but are not limited to, Zea mays L., Oryza sativa, Secale cereale, Triticum aestivum, Daucus carota, Brassica oleracea, Cucumis melo, Cucumis sativus, Lactuca sativa, Solanum tuberosum, Lycopersicon esculentum, Phaseolus vulgaris, Brassica napus, etc.

[0115] Transformation techniques are well known to persons skilled in the art. For example, the introduction of expression cassettes using polyethylene glycol precipitation is described in Paszkowski et al., EMBO J., 3:2717-2722 (1984). Electroporation techniques are described in Fromm et al., Proc. Natl. Acad. Sci. USA, 82:5824 (1985). Biolistic transformation techniques are described in Klein et al., Nature, 327:70-73 (1987).

[0116] Agrobacterium tumefaciens-mediated transformation techniques are well known to persons skilled in the art (See, for example, Horsch et al., Science 233:496-498 (1984), and Fraley et al., Proc. Natl. Acad. Sci. USA, 80:4803 (1983)). Although Agrobacterium is useful primarily in dicots, certain monocots can be transformed by Agrobacterium. U.S. Pat. No. 5,550,318 describes Agrobacterium transformation of maize.

[0117] Moreover, the following methods of transfection or transformation can also be used: (a) Agrobacterium rhizogenes-mediated transformation (See, Lichtenstein and Fuller In Genetic Engineering, vol. 6, PWJ Rigby, Ed., London, Academic Press, (1987)); (b) liposome-mediated DNA uptake (See, Freeman et al., Plant Cell Physiol., 25:1353 (1984)); and (c) the vortexing method (See, Kindle, Proc. Natl. Acad. Sci. USA, 87:1228 (1990)).

[0118] Transformed plant cells which are derived by any of the above transformation techniques can be cultured to regenerate a whole plant which possesses the transformed genotype. Such regeneration techniques rely on manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a biocide and/or herbicide marker which has been introduced together with the ZmMBD1 or ZmMBD2 nucleic acid. Plant regeneration from cultured protoplasts is described in Evans et al., Protoplasts Isolation and Culture, Handbook of Plant Cell Culture, pp. 124-176, Macmillan Publishing Company, New York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants, organs, or parts thereof. Such regeneration techniques are described generally in Klee et al., Ann. Rev. of Plant Phys., 38:467-486 (1987).

[0119] One of ordinary skill in the art will recognize that after the expression cassette is stably incorporated in transgenic plants and confirmed to be operable, it can be introduced into other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed.

[0120] Transgenic plants containing the expression cassettes described herein can be identified by using restriction enzymes or High Performance Liquid Chromatography. Techniques for restriction enzymes and High Performance Liquid Chromatography are well known to persons skilled in the art. Transgenic plants containing the expression cassettes described herein can also be identified by using a Northern Blot analysis, which is well known to persons skilled in the art.

[0121] The hereinbefore described expression cassettes can be inserted into a plant in order to enhance the expression of a silenced targeted gene or to reactivate a silenced gene in a plant in vivo. The hereinbefore described expression cassettes of the present invention containing the ZmMBD1 and/or ZmMBD2 nucleic acids in the antisense direction can be inserted into a plant. The antisense RNA produced by the hereinbefore described expression cassettes can then form a complex with the endogenous mRNA from the ZmMBD1 and/or ZmMBD2 nucleic acids within the plant. This complex should effectively “knock-out” the function of the endogenous ZmMBD1 and/or ZmMBD2 genes. By “knocking-out” the function of the endogenous ZmMBD1 and/or ZmMBD2 genes it would be possible to relieve the methylation induced silencing that is mediated by these proteins. This could be useful in relieving the silencing of introduced transgenes or endogenous genes.

[0122] Synthetic Polypeptides and Purification of Polypeptides

[0123] In addition to being produced recombinantly, the polypeptides of the present invention can also be produced synthetically, using techniques known in the art. For example, polypeptides having a length of about 50 amino acids can be synthesized using solid phase synthesis techniques, such as those described by Barany and Merrifield, Solid-Phase Peptide Synthesis, pp. 3-284 in The Peptides. Analysis, Synthesis, Biology. Vol. 2: Special Methods in Peptide Synthesis, Part A.; Merrifield et al., J. Am. Chem. Soc. 85:2149-2156 (1963). Polypeptides having a length greater than about 50 amino acids can be synthesized by condensation of the amino and carboxy termini of shorter fragments, a technique which is well known to persons skilled in the art.

[0124] Polypeptides of the present invention produced either recombinantly or synthetically, can be purified using standard techniques known to those persons skilled in the art, including, but not limited to, column chromatography, selective precipitation with ammonium sulfate, affinity chromatography, etc.

[0125] The following Examples are offered by way of illustration, not limitation.

EXAMPLE 1 Isolation and Cloning of ZmMBD1 and ZmMBD2

[0126] Cloning of ZmMBD1 and ZmMBD2: The MBD of human MeCP2 was used in a TBLASTN search of the public maize EST database. Two distinct contigs with significant similarity to the MBD were found and used to design RACE primers. The primers used were ZmMBD1F1-CGA GAG CGA GAG CAA AGA GCT GAG C (SEQ ID NO: 5), ZmMBD1R1-CTC TGC CTC CTT GCC AGT TTC AGC (SEQ ID NO: 6), ZmMBD2F1-GGG CAG AGC AAG AGC TAG GGA TAA CC (SEQ ID NO: 7), and ZmMBD2R1-CAT CTC CAC GTC AGT CTC CTT TGT GC (SEQ ID NO: 8). PCR conditions were as follows: 94° for 2′, 5 cycles of 94° for 1′, 72° for 4′, 5 cycles of 94° for 1′, 70° for 4′, 25 cycles of 94° for 1′, 68° for 4′, then 72° for 7′. Each 50 μl reaction contained 1 μl of a 10 μM primer stock, 5 μl of Marathon cDNA (Clontech, Palo Alto Calif.), 5 μl 10× buffer, 0.6 μl 25 mM dNTP's (Promega, Madison Wis.), 1 μl Advantage2 polymerase (Clontech, Palo Alto Calif.) and 36.4 μl ddH₂O. RACE products were purified in a 1% LMP agarose gel and cloned into pGEM-T Easy (Promega, Madison Wis.).
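The stepped-down annealing/extension temperatures of the RACE program can be written out as data, which makes the cycle count and programmed run time easy to check. The following is a minimal sketch only; the names `RACE_PROGRAM`, `amplification_cycles` and `hold_minutes` are illustrative and not part of the original protocol, and ramp times between temperatures are ignored.

```python
# Each stage: (number of cycles, [(temperature in deg C, hold time in minutes), ...]).
# Values are taken from the RACE PCR conditions described above.
RACE_PROGRAM = [
    (1, [(94, 2)]),             # initial denaturation
    (5, [(94, 1), (72, 4)]),    # first 5 cycles, 72 deg anneal/extend
    (5, [(94, 1), (70, 4)]),    # next 5 cycles, 70 deg anneal/extend
    (25, [(94, 1), (68, 4)]),   # final 25 cycles, 68 deg anneal/extend
    (1, [(72, 7)]),             # final extension
]


def amplification_cycles(program):
    """Count cycles in the multi-step (denature + anneal/extend) stages."""
    return sum(cycles for cycles, steps in program if len(steps) > 1)


def hold_minutes(program):
    """Sum the programmed hold times; ramping between temperatures is ignored."""
    return sum(cycles * sum(minutes for _, minutes in steps)
               for cycles, steps in program)


print(amplification_cycles(RACE_PROGRAM))  # 35
print(hold_minutes(RACE_PROGRAM))          # 184
```

The actual run time on a thermal cycler would be longer than the 184 minutes of programmed holds because of ramping.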

[0127] Sequencing

[0128] The plasmids were sequenced using BigDye terminator cycle sequencing on an ABI sequencer (Perkin-Elmer Applied Biosystems). Sequencing reactions were done in a 10 μl volume with 320 ng DNA and 10 pg of primer. Primers used were as follows: T7 (Promega, Madison Wis.), SP6 (Promega, Madison Wis.), ZmMBD1F3 - GGA GAC TGA TGT GGA GAT GAA GCC (SEQ ID NO:9), ZmMBD1R3 - GTT GCG GCT TCA GGT GCA CTT C (SEQ ID NO:10), ZmMBD1F4 - GCA ACC CAA CTG AGG ATT CGG (SEQ ID NO:11), ZmMBD2F3 - CCT GCT GAA GAG GCG AAG GAA G (SEQ ID NO:12), ZmMBD2R3 - GCG GTG TTC TCT AGT GGA GCG (SEQ ID NO:13).

[0129] RT-PCR Analysis

[0130] Total RNA was extracted from tissues including embryo, leaf, immature ear, immature tassel, 3-day root, pollen and BMS (Black Mexican Sweet) suspension cultures using TRIzol (Life Technologies Gibco/BRL). PolyA+ RNA, isolated using PolyAtract (Promega, Madison Wis.), was used to make cDNA with the Marathon cDNA Amplification Kit (Clontech, Palo Alto Calif.). 2 ng of cDNA was used in each PCR reaction. The primers used were: ZmMBD1F1, ZmMBD1R1, ZmMBD2F1, and ZmMBD2R1. Cycling conditions were as follows: 94° for 2′, 5 cycles of 94° for 30″, 70° for 30″, 72° for 1′, 5 cycles of 94° for 30″, 67.5° for 30″, 72° for 1′, then 25 cycles of 94° for 30″, 65° for 30″, 72° for 1′ followed by 72° for 7′. Each 25 μl reaction contained 1 μl of a 10 μM primer solution for each primer, 2 ng cDNA, 2.5 μl 10× buffer, 2 μl 25 mM MgCl₂, 0.3 μl 25 mM dNTP's (Promega, Madison Wis.), 0.2 μl Taq polymerase (Promega, Madison Wis.) and 17 μl ddH₂O.
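The RT-PCR reaction composition can be checked arithmetically. The sketch below assumes, for illustration only, that each reaction contains one forward and one reverse primer (two primers at 1 μl each) and that the 2 ng of cDNA is delivered in whatever volume remains; neither assumption is stated explicitly in the text. The helper names are hypothetical.

```python
REACTION_UL = 25.0

# Component volumes in microliters, as listed for the RT-PCR reaction.
# Assumption: one forward and one reverse primer per reaction.
COMPONENTS_UL = {
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "10x buffer": 2.5,
    "25 mM MgCl2": 2.0,
    "25 mM dNTPs": 0.3,
    "Taq polymerase": 0.2,
    "ddH2O": 17.0,
}


def remaining_volume(total_ul, components):
    """Volume left over for the cDNA template after all listed components."""
    return round(total_ul - sum(components.values()), 2)


def final_concentration(stock, volume_ul, total_ul):
    """Dilution: final concentration = stock * (volume added / total volume)."""
    return stock * volume_ul / total_ul


print(remaining_volume(REACTION_UL, COMPONENTS_UL))          # 1.0 (ul for cDNA)
print(final_concentration(25.0, 2.0, REACTION_UL))           # 2.0 (mM MgCl2)
print(round(final_concentration(10.0, 1.0, REACTION_UL), 3))  # 0.4 (uM each primer)
```

Under these assumptions the listed components account for 24 μl, leaving 1 μl for the template.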

[0131] Sequence Analysis

[0132] The sequences were assembled through the contig assembly program (http://gcg.tigem.it/ASSEMBLY/assemble.html). Reverse complement, translation and ClustalW were all accessed from the ABCC sequence analysis page (http://biosci.cbs.umn.edu/seqanal/). ClustalW alignments were processed using Boxshade (http://www.ch.embnet.org/software/BOX_form.html). All BLAST searches were performed using the NCBI BLAST feature. For some searches the advanced BLAST feature was used and a target organism was specified. Targeting signals and putative localization were predicted using PSORT (http://psort.nibb.ac.jp/). Domains were identified using SMART (http://smart.embl-heidelberg.de/). Coiled-coils were identified using COILS prediction program (http://dot.imgen.bcm.tmc.edu:9331/seq-search/struc-predict.html). Secondary structure prediction was performed using CONSENSUS (http://pbil.ibcp.fr/NPSA/npsa_prediction.html). A tertiary structure prediction was performed using the Swiss Model prediction program (http://www.expasy.ch/swissmod/SWISS-MODEL.html).
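The reverse-complement and translation steps named above were performed with web tools; the operations themselves can be sketched in a few lines. The following is an illustrative stand-in, not the tools actually used. It applies the standard genetic code to the first 30 nucleotides of the ZmMBD1 open reading frame as given in SEQ ID NO: 1; the function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Start of the ZmMBD1 ORF, copied from SEQ ID NO: 1.
orf_start = "atgcccgcacccgacgggtggaccaagaag"
print(translate(orf_start))  # MPAPDGWTKK

# An EST read in the reverse orientation must be reverse-complemented first.
reverse_read = reverse_complement(orf_start)
print(translate(reverse_complement(reverse_read)))  # MPAPDGWTKK
```

The translated residues agree with the first ten amino acids of SEQ ID NO: 2.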

Results

[0133] Cloning of ZmMBD1 and ZmMBD2

[0134] The methyl-CpG-binding domain (MBD) of human MeCP2 was used in a TBLASTN search of the public maize EST database (www.zmdb.iastate.edu). Putative hits were then used in BLASTX searches to determine if putative methyl-CpG-binding domains were present. Two distinct contigs, named ZmMBD1 and ZmMBD2, contained domains that identified MBD proteins in BLASTX searches (www.ncbi.nlm.nih.gov/BLAST). ZmMBD1 was represented by the ESTs AI770580 and AI881737 which are from an ear tissue library (See, Table 1 below). ZmMBD2 was represented by ESTs AI668487 and AI881737 from an ear and an endosperm library. All ESTs corresponding to ZmMBD1 and ZmMBD2 were in the reverse orientation.

[0135] The assembled ZmMBD1 nucleic acid is 1773 base pairs (bp) in length (FIG. 1A). The deduced protein encoded by the ZmMBD1 nucleic acid is 433 amino acids in length (See FIG. 1B). The completed nucleic acid sequence was used to BLAST all maize EST sequences. Nine ESTs (including the initial two hits) representing ZmMBD1 were present in the public maize EST collection (See, Table 1A below). The seven additional hits all represented the 3′ end of the nucleic acid. The region of identity between each EST and the putative protein is indicated.

[0136] The nucleic acid sequence of ZmMBD2 is 1728 bp with an open reading frame from bp 139 to 1426 (FIG. 2), which encodes a ZmMBD2 protein having a length of 428 amino acids. A BLAST search of maize ESTs using the full-length ZmMBD2 nucleic acid sequence revealed the presence of 12 ZmMBD2 ESTs (See Table 1B, below). The ZmMBD2 ESTs can be assembled into a contig that contains the entire coding sequence. The EST contig is 99% identical to ZmMBD2 (the 1% differences are likely due to EST sequencing errors). Both ZmMBD1 and ZmMBD2 are predicted to reside in the nucleus by the localization prediction program PSORT (Nakai and Kanehisa, Genomics, 14:897-911 (1992)). A putative bipartite nuclear localization signal was found at amino acids 91-118 (See, FIGS. 1B and 2B). To test for the presence of recognizable domains, the ZmMBD1 and ZmMBD2 sequences were submitted to the SMART server (Schultz et al., Proc. Natl. Acad. Sci. USA, 95: 5857-5864 (1998); Schultz et al., Nucl. Acids Res., 28: 231-234 (2000)). A methyl-CpG-binding domain was found at amino acids 1-60 of both proteins. A coiled-coil domain was found at amino acids 156-170 of ZmMBD1 and 139-152 of ZmMBD2 by COILS (Lupas et al., Science, 252(5010):1162-4 (1991)). Coiled-coil domains are often involved in protein-protein interactions. ZmMBD1 protein contains 11 tandem repeats of a 16 amino acid proline/alanine rich unit from amino acids 203-378 (FIGS. 3 and 4A). The consensus amino acid sequence from the ZmMBD1 protein repeats is tksdaePAavAAPape (single letter amino acid code, with lower case letters representing residues present in the majority of repeats and upper case letters indicating residues that are completely conserved). A very similar organization is observed for ZmMBD2. FIG. 4B shows an alignment of the ZmMBD2 protein tandem repeats, which contain the consensus sequence tksvAepAAvaaPape. The consensus sequence of the ZmMBD1 and ZmMBD2 protein repeats is identical except at position 4. The CONSENSUS secondary structure prediction program suggests the presence of alpha-helices in these repeats (http://pbil.ibcp.fr/NPSA/npsa_prediction.html).
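The ZmMBD1 repeat consensus can be reproduced from the sequence listing. In the sketch below the eleven 16-residue units are an assumption: they were obtained by splitting residues 203-378 of SEQ ID NO: 2 into consecutive 16-residue windows and converting to single-letter code, which may not match the exact register drawn in FIG. 4A. The function name `consensus` is illustrative.

```python
from collections import Counter

# Residues 203-378 of SEQ ID NO: 2, split into consecutive 16-residue windows.
# The repeat register is an assumption made for this illustration.
ZMMBD1_REPEATS = [
    "APAPAEPADVAAPAAE",
    "TKSDAKPAAVAAPVPE",
    "TKSDAEPAAVAAPAPE",
    "TKSDAEPAAVAAPAPE",
    "TKSVAEPAAVAAPAPE",
    "TKSDAEPAAVAAPVPE",
    "TKSDAEPAAVAAPVPE",
    "TKSDAEPAADAAPVPE",
    "MKSESEPAAVAAPASE",
    "TKSDAEPAAVAAPAPE",
    "TKSDAEPAAAAAPVPG",
]


def consensus(repeats):
    """Most common residue per column: upper case if invariant, else lower case."""
    letters = []
    for column in zip(*repeats):
        residue, count = Counter(column).most_common(1)[0]
        letters.append(residue if count == len(repeats) else residue.lower())
    return "".join(letters)


print(consensus(ZMMBD1_REPEATS))  # tksdaePAavAAPape
```

With this register, the column-wise majority reproduces the consensus stated in the text, including the five fully conserved positions.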

[0137] The sequences of ZmMBD1 and ZmMBD2 have substantial identity to each other over the entire coding region. The nucleotide sequences of the cDNAs are 76.5% identical over their entire length. The putative proteins are 74% identical and 78% similar. The N-terminal 100 amino acids display substantial identity to one another (96%) (See, FIG. 5). The C-terminal regions are not as highly conserved.
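The N-terminal identity figure can be checked directly, because the first 100 residues of the two proteins align without gaps. The sketch below uses the first 100 amino acids of SEQ ID NO: 2 and SEQ ID NO: 4, converted to single-letter code for this illustration; the helper names are hypothetical, and the approach is a simple position-by-position comparison rather than the alignment program used for FIG. 5.

```python
# First 100 residues of SEQ ID NO: 2 (ZmMBD1) and SEQ ID NO: 4 (ZmMBD2).
ZMMBD1_N100 = (
    "MPAPDGWTKKFTPLRGGRSEIVFVSPTGEEIKNKRQLSQYLKAHPGGP"
    "AVSEFDWGTGDTPRRSARISEKVKVFDSPEGEKIPKRSRNSSGRKGKQGKKE"
)
ZMMBD2_N100 = (
    "MPAPDGWTKKFTPQRGGRSEIVFVSPTGEEIKNKRQLSQYLKAHPGGP"
    "AASDFDWGTGDTPRRSARISEKVKVFDSPEGEKIPKRSRNSSGRKGRQGKKE"
)


def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def mismatches(seq_a, seq_b):
    """List (1-based position, residue in a, residue in b) for each difference."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


print(percent_identity(ZMMBD1_N100, ZMMBD2_N100))  # 96.0
print(mismatches(ZMMBD1_N100, ZMMBD2_N100))
# [(14, 'L', 'Q'), (50, 'V', 'A'), (52, 'E', 'D'), (95, 'K', 'R')]
```

Four substitutions in 100 positions give the 96% identity reported above.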

[0138] ZmMBD1 and ZmMBD2 proteins were isolated based on the potential methyl-CpG-binding domains. The similarity between ZmMBD1 and ZmMBD2 proteins and the mammalian MBD proteins is found in amino acids 1-60. FIG. 6A shows the alignment of methyl-CpG-binding domains from a number of proteins. The NMR solution structures for the methyl-CpG-binding domain of MeCP2 and MBD1 have been solved (Wakefield et al., J. Mol. Biol., 291:1055-1065 (1999); Ohki et al., EMBO J., 18:6653-6661 (1999)). Amino acids that have been identified as important for proper folding and function are indicated in FIG. 6B. Most of these residues are conserved or conservatively substituted in the ZmMBD proteins.

[0139] The expression pattern of ZmMBD1 and ZmMBD2 was assessed by using RT-PCR to detect ZmMBD1 and ZmMBD2 transcripts in several maize tissues (FIG. 7). ZmMBD1 transcripts were detected in all tissues except pollen (which did not yield significant amounts of product with the ubiquitin control reactions). Root cDNA and BMS cell culture cDNA showed a lower level of expression relative to other tissues. The expression pattern of ZmMBD2 is similar to that of ZmMBD1 except in the leaf tissue: ZmMBD1 is expressed in mature leaves, whereas ZmMBD2 is not.

[0140] The origins of the RNA used to generate the ZmMBD1 and ZmMBD2 ESTs also give an indication of expression patterns. Table 2 indicates the origin of the 53,000 maize EST's according to their cDNA library. Eight of the nine ZmMBD1 ESTs are from the ear library, although this library only comprises 10.5% of all ESTs. This indicates high expression in ear tissue relative to other tissues in the EST collection. ZmMBD2 EST's were found in ear (5 ESTs), endosperm (5 ESTs) and mixed stages pollen and anther (2 ESTs) tissue. These three libraries compose only 30% of all EST sequences. Together the EST sequences indicate high expression in the ear for ZmMBD1 protein and high expression in the ear, endosperm and anther/pollen for ZmMBD2 protein. The sheer number of ZmMBD ESTs suggests that ZmMBD1 and ZmMBD2 protein are each highly expressed.

TABLE 1
ZmMBD ESTs

A. ZmMBD1 ESTs

Accession number    Amino acids included in EST sequence    Tissue source of cDNA library
AI770580            1-161                                   Ear
AI881737            1-109                                   Ear
AI714501            332-433                                 Ear
AI834699            336-433                                 Ear
AI691917            342-433                                 Ear
AI738352            349-433                                 Ear
AI834474            352-433                                 Ear
AI901411            383-433                                 Tassel
AI666009            383-433                                 Ear

B. ZmMBD2 ESTs

Accession number    Amino acids included in EST sequence    Tissue source of cDNA library
AI881737            1-109                                   Ear
AI668487            1-161                                   Endosperm
AI854908            41-169                                  Endosperm
AI770834            138-329                                 Ear
AW499227            186-313                                 Mixed pollen and anther
AI734531            306-428                                 Ear
AW498214            317-428                                 Mixed pollen and anther
AI665167            310-428                                 Endosperm
AI666051            332-428                                 Ear
AI833694            349-428                                 Endosperm
AI677051            356-428                                 Endosperm
AI738201            357-428                                 Ear

[0141] TABLE 2
Distribution of Maize ESTs by library

Library    Tissue source of cDNA library     Total ESTs    % of total    ZmMBD1 ESTs    ZmMBD2 ESTs
486        Leaf primordia                    5,868         11.1%         0              0
487        Apical meristem                   684           1.3%          0              0
496        Stressed shoot                    1,307         2.5%          0              0
603        Stressed root                     2,023         3.9%          0              0
605        Endosperm                         6,566         12.5%         0              5
606        Ear                               5,523         10.5%         8              5
614        Root                              10,652        20.3%         0              0
618        Tassel                            3,408         6.5%          1              0
660        Mixed stages anther and pollen    4,651         8.9%          0              2
683        14 day immature embryo            1,138         2.2%          0              0
687        Early embryo                      4,937         9.4%          0              0
707        Mixed adult tissues               5,436         10.4%         0              0
829        Silk infected with fusarium       244           0.5%          0              0
Total                                        52,437                      9              12
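The argument that EST counts reflect relative expression rests on comparing the share of ZmMBD ESTs from a library with that library's share of all ESTs. The sketch below computes that ratio from the Table 2 counts. It is an illustrative calculation only: the name `fold_enrichment` is hypothetical, and no statistical test is implied.

```python
TOTAL_ESTS = 52437

# Library sizes and ZmMBD EST counts taken from Table 2 (non-zero rows only).
LIBRARY_ESTS = {
    "Endosperm": 6566,
    "Ear": 5523,
    "Tassel": 3408,
    "Mixed stages anther and pollen": 4651,
}
ZMMBD1_HITS = {"Ear": 8, "Tassel": 1}
ZMMBD2_HITS = {"Endosperm": 5, "Ear": 5, "Mixed stages anther and pollen": 2}


def library_share(library):
    """Fraction of all ESTs that come from the given library."""
    return LIBRARY_ESTS[library] / TOTAL_ESTS


def fold_enrichment(hits, library):
    """Observed share of a gene's ESTs in a library divided by the library's share."""
    observed = hits[library] / sum(hits.values())
    return observed / library_share(library)


print(round(100 * library_share("Ear"), 1))  # 10.5
print(fold_enrichment(ZMMBD1_HITS, "Ear"))   # roughly 8-fold over expectation

# Combined share of the three libraries that contain ZmMBD2 ESTs.
zmmbd2_share = sum(library_share(name) for name in ZMMBD2_HITS)
print(round(100 * zmmbd2_share, 1))
```

Eight of nine ZmMBD1 ESTs coming from a library that holds about a tenth of all ESTs corresponds to a ratio of a little over eight.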

[0142] All references, patents and patent applications referred to herein are hereby incorporated by reference.

[0143] The present invention is illustrated by way of the foregoing description and examples. The foregoing description is intended as a non-limiting illustration, since many variations will become apparent to those skilled in the art in view thereof. It is intended that all such variations within the scope and spirit of the appended claims be embraced thereby.

[0144] Changes can be made to the composition, operation and arrangement of the method of the present invention described herein without departing from the concept and scope of the invention as defined in the following claims.

1 13 1 1770 DNA Zea mays 1 gaattcacta gtgattcgag agcgagagca aagagctgag cgggacggag cagctagcgc 60 tagggataac gcacacccac caccatggcg acgcccggag agcagccggc ggaggtcgtg 120 tccgtcgaga tgcccgcacc cgacgggtgg accaagaagt ttactccctt gagagggggg 180 agatctgaga ttgtttttgt ttcaccaact ggcgaggaaa ttaagaacaa gaggcaatta 240 agtcagtacc taaaggcaca ccctggaggc cctgctgttt cagagtttga ttggggaact 300 ggtgataccc caaggcgttc tgctcgtatt agcgagaaag tcaaggtatt tgatagccca 360 gagggcgaga agatcccgaa gcgcagcagg aactccagtg gtaggaaggg taagcagggg 420 aagaaggaaa cccctgagac cgaagaagcc aaagatgctg aaactggcaa ggaggcagag 480 gaggccccaa gcgaagatgc cgcaaaggag actgatgtgg agatgaagcc tgctgaagag 540 gtgaaggggg cctctgctga aacagaagat gctgacatgg ctgatgctcc tgcaccagca 600 ccaatggaag aagataagaa acaaactgaa gaactggcag aagctattgc agctcctcct 660 gtgccatcgg aggagaagaa agatgtcaag ccagctgagc ctgaagctgc agcaagcaac 720 ccaactgagg attcggcccc tgctcctgct gaacctgctg atgttgctgc tccagctgct 780 gagaccaaat cagacgccaa acctgctgct gttgctgctc cagtgcctga gaccaaatca 840 gacgccgagc ctgctgctgt tgctgctcca gcgcctgaga ccaaatcaga cgctgagcct 900 gctgctgttg ctgctccagc tcctgagacc aaatcagtag ccgagcctgc tgctgttgct 960 gctccagcgc ctgagaccaa atcagatgcc gagcctgctg ctgttgctgc tccagtgcct 1020 gagaccaaat cagatgccga gcctgctgct gttgctgctc cagtgcctga aacgaaatca 1080 gatgctgagc ctgccgctga tgctgctcca gttcctgaga tgaaatcaga gtccgagcct 1140 gctgctgttg ctgctccagc gtctgaaacc aaatcagacg ccgagcctgc tgccgttgct 1200 gctccagcgc ctgagaccaa atcagatgcc gagcctgctg ctgctgctgc tccagtgcct 1260 ggaaccaact cagatgctgc tgctactgat ccagcacctg gaaccaaggc cactgccgcc 1320 gatccagcac ctggagctcc agcagagaac tccaccgaca aagacggaag ccaggagagc 1380 cagcccgtga acaatggaca gctgccgcac tcgacggtga agtgcacctg aagccgcaac 1440 gccggaaaaa cctgaacccc ttccggcgcc gtgtatctca tttagggaca aacattcgca 1500 ttcgcattca cattgttatc agttaaaaag tctatggaac ggtaagcatg ttaattagtc 1560 agtctgacgc tgctgggtga cgggttggta attaggctct gcatttagct gttttgtatg 1620 gctgctgtcg ccgccataca tgtatcatct tcttgtggtt gtatggtacc taaccgcgta 1680 
aacatgttaa ccttaggcca tgtattaatt actccgctct aatgtttgta tgtgtattgt 1740 caaacttgca gaaccagttg catgcatgaa 1770 2 433 PRT Zea mays 2 Met Pro Ala Pro Asp Gly Trp Thr Lys Lys Phe Thr Pro Leu Arg Gly 1 5 10 15 Gly Arg Ser Glu Ile Val Phe Val Ser Pro Thr Gly Glu Glu Ile Lys 20 25 30 Asn Lys Arg Gln Leu Ser Gln Tyr Leu Lys Ala His Pro Gly Gly Pro 35 40 45 Ala Val Ser Glu Phe Asp Trp Gly Thr Gly Asp Thr Pro Arg Arg Ser 50 55 60 Ala Arg Ile Ser Glu Lys Val Lys Val Phe Asp Ser Pro Glu Gly Glu 65 70 75 80 Lys Ile Pro Lys Arg Ser Arg Asn Ser Ser Gly Arg Lys Gly Lys Gln 85 90 95 Gly Lys Lys Glu Thr Pro Glu Thr Glu Glu Ala Lys Asp Ala Glu Thr 100 105 110 Gly Lys Glu Ala Glu Glu Ala Pro Ser Glu Asp Ala Ala Lys Glu Thr 115 120 125 Asp Val Glu Met Lys Pro Ala Glu Glu Val Lys Gly Ala Ser Ala Glu 130 135 140 Thr Glu Asp Ala Asp Met Ala Asp Ala Pro Ala Pro Ala Pro Met Glu 145 150 155 160 Glu Asp Lys Lys Gln Thr Glu Glu Leu Ala Glu Ala Ile Ala Ala Pro 165 170 175 Pro Val Pro Ser Glu Glu Lys Lys Asp Val Lys Pro Ala Glu Pro Glu 180 185 190 Ala Ala Ala Ser Asn Pro Thr Glu Asp Ser Ala Pro Ala Pro Ala Glu 195 200 205 Pro Ala Asp Val Ala Ala Pro Ala Ala Glu Thr Lys Ser Asp Ala Lys 210 215 220 Pro Ala Ala Val Ala Ala Pro Val Pro Glu Thr Lys Ser Asp Ala Glu 225 230 235 240 Pro Ala Ala Val Ala Ala Pro Ala Pro Glu Thr Lys Ser Asp Ala Glu 245 250 255 Pro Ala Ala Val Ala Ala Pro Ala Pro Glu Thr Lys Ser Val Ala Glu 260 265 270 Pro Ala Ala Val Ala Ala Pro Ala Pro Glu Thr Lys Ser Asp Ala Glu 275 280 285 Pro Ala Ala Val Ala Ala Pro Val Pro Glu Thr Lys Ser Asp Ala Glu 290 295 300 Pro Ala Ala Val Ala Ala Pro Val Pro Glu Thr Lys Ser Asp Ala Glu 305 310 315 320 Pro Ala Ala Asp Ala Ala Pro Val Pro Glu Met Lys Ser Glu Ser Glu 325 330 335 Pro Ala Ala Val Ala Ala Pro Ala Ser Glu Thr Lys Ser Asp Ala Glu 340 345 350 Pro Ala Ala Val Ala Ala Pro Ala Pro Glu Thr Lys Ser Asp Ala Glu 355 360 365 Pro Ala Ala Ala Ala Ala Pro Val Pro Gly Thr Asn Ser Asp Ala Ala 370 375 380 Ala Thr Asp Pro Ala Pro Gly Thr Lys Ala Thr Ala 
Ala Asp Pro Ala 385 390 395 400 Pro Gly Ala Pro Ala Glu Asn Ser Thr Asp Lys Asp Gly Ser Gln Glu 405 410 415 Ser Gln Pro Val Asn Asn Gly Gln Leu Pro His Ser Thr Val Lys Cys 420 425 430 Thr 3 1728 DNA Zea mays 3 tcgagcggcc gcccgggcag gtcaagcggc gagagcggga cggggcagag caagagctag 60 ggataaccct cgcccaccat ggcgacgccc ggcgagcagc aggcaccggc tgcggcggag 120 taggtcgtgt ccgtcgagat gcctgcaccc gacgggtgga ccaagaagtt tactccccag 180 agagggggaa gatccgagat tgtttttgtt tcgccaactg gcgaggaaat taagaacaag 240 aggcaactaa gccaatacct aaaggcacac cctggaggcc ctgctgcttc agattttgat 300 tggggaactg gtgatacccc aaggcgttct gctcgcatta gcgagaaagt caaggttttt 360 gatagcccag agggcgagaa gatcccgaaa cgcagcagga actccagtgg taggaagggt 420 aggcagggaa agaaggaagc ccctgaaact gaagaagcca aagatgctga aaccggccag 480 gacgccccaa gtgaagatgg cacaaaggag actgacgtgg agatgaagcc tgctgaagag 540 gcgaaggaag ctcctactga aactgatgac gctgagaagg ctgcagacaa ggcggacgat 600 actcctgctc cggcgccaat ggaagaagat gagaaagaaa ctgagaaacc agctgaagct 660 gttgtagctc ctcttgcgca atcggaggag aagaaagaag atgccaagcc agatgagcct 720 gaagctgtgg ctccagctcc agtaagcaac ccaactgaga actcagcccc tgctcctgcc 780 gagcctgctg ctgttcctgc cccagtgcct gagaccgaat cagttgccga gcctgctgct 840 gttctcgccc cagcgcctga aaccaaacca gatgccaagc ctgctgctgt tcctgcccca 900 gcgcctgaaa acaaaccaga tgccgagcct gctgctgctg ctgctccagt gcctgacacc 960 aaatcagttg ctgagcctgc tgctgctcca gcgcctgaca ccaaatcagt tgctgaacct 1020 gctgctgctg ctccagtgcc cgagaccaaa ctagttgctg aatctgctgc tgatgctgtt 1080 gctgctccgg cgcctgaaac caaatcagat gccgagcctg ctgctgctcc agtgcccgag 1140 accaaaccag ttgctgaatc tgctgctgat gctgttgctg ctccagcgcc tgaaaccaaa 1200 tcagatgccg agcctgctgc tgccgctgat ccagcacctg aaatcaaatc agatgctgcc 1260 gccgctgatc cagcacccgg gaccaaggca gatgctgccg ccactgatgc cgcgcctgga 1320 gccgagccag acgccgctcc actagagaac accgctgccg acaaaggcgg aagcgaggag 1380 agcagccagc ccgtgaacaa cgtgaacaac gggcactcaa cgtgaagtgc atctgaggcc 1440 ggaacggaac accccttcca gcaccgtgta tctcatgtag gaacaaacat ttgcattcgc 1500 attgtaatct gatggatcgg 
taagcatgtt gattagtcag tctgacgctg ctgggtgaca 1560 tgttggtaat taggctctgc atttgagctc tttttttttg tatggctgct gtcgccgccg 1620 tacatgtatc ctgtatccac caccatcatc atccccttgg actattggtt gtgttgtacc 1680 taactgcgta agcatattat aaaaatacct gcccgggcgg ccgctcga 1728 4 428 PRT Zea mays 4 Met Pro Ala Pro Asp Gly Trp Thr Lys Lys Phe Thr Pro Gln Arg Gly 1 5 10 15 Gly Arg Ser Glu Ile Val Phe Val Ser Pro Thr Gly Glu Glu Ile Lys 20 25 30 Asn Lys Arg Gln Leu Ser Gln Tyr Leu Lys Ala His Pro Gly Gly Pro 35 40 45 Ala Ala Ser Asp Phe Asp Trp Gly Thr Gly Asp Thr Pro Arg Arg Ser 50 55 60 Ala Arg Ile Ser Glu Lys Val Lys Val Phe Asp Ser Pro Glu Gly Glu 65 70 75 80 Lys Ile Pro Lys Arg Ser Arg Asn Ser Ser Gly Arg Lys Gly Arg Gln 85 90 95 Gly Lys Lys Glu Ala Pro Glu Thr Glu Glu Ala Lys Asp Ala Glu Thr 100 105 110 Gly Gln Asp Ala Pro Ser Glu Asp Gly Thr Lys Glu Thr Asp Val Glu 115 120 125 Met Lys Pro Ala Glu Glu Ala Lys Glu Ala Pro Thr Glu Thr Asp Asp 130 135 140 Ala Glu Lys Ala Ala Asp Lys Ala Asp Asp Thr Pro Ala Pro Ala Pro 145 150 155 160 Met Glu Glu Asp Glu Lys Glu Thr Glu Lys Pro Ala Glu Ala Val Val 165 170 175 Ala Pro Leu Ala Gln Ser Glu Glu Lys Lys Glu Asp Ala Lys Pro Asp 180 185 190 Glu Pro Glu Ala Val Ala Pro Ala Pro Val Ser Asn Pro Thr Glu Asn 195 200 205 Ser Ala Pro Ala Pro Ala Glu Pro Ala Ala Val Pro Ala Pro Val Pro 210 215 220 Glu Thr Glu Ser Val Ala Glu Pro Ala Ala Val Leu Ala Pro Ala Pro 225 230 235 240 Glu Thr Lys Pro Asp Ala Lys Pro Ala Ala Val Pro Ala Pro Ala Pro 245 250 255 Glu Asn Lys Pro Asp Ala Glu Pro Ala Ala Ala Ala Ala Pro Val Pro 260 265 270 Asp Thr Lys Ser Val Ala Glu Pro Ala Ala Ala Pro Ala Pro Asp Thr 275 280 285 Lys Ser Val Ala Glu Pro Ala Ala Ala Ala Pro Val Pro Glu Thr Lys 290 295 300 Leu Val Ala Glu Ser Ala Ala Asp Ala Val Ala Ala Pro Ala Pro Glu 305 310 315 320 Thr Lys Ser Asp Ala Glu Pro Ala Ala Ala Pro Val Pro Glu Thr Lys 325 330 335 Pro Val Ala Glu Ser Ala Ala Asp Ala Val Ala Ala Pro Ala Pro Glu 340 345 350 Thr Lys Ser Asp Ala Glu Pro Ala Ala Ala Ala Asp Pro Ala Pro 
Glu 355 360 365 Ile Lys Ser Asp Ala Ala Ala Ala Asp Pro Ala Pro Gly Thr Lys Ala 370 375 380 Asp Ala Ala Ala Thr Asp Ala Ala Pro Gly Ala Glu Pro Asp Ala Ala 385 390 395 400 Pro Leu Glu Asn Thr Ala Ala Asp Lys Gly Gly Ser Glu Glu Ser Ser 405 410 415 Gln Pro Val Asn Asn Val Asn Asn Gly His Ser Thr 420 425 5 25 DNA Zea mays 5 cgagagcgag agcaaagagc tgagc 25 6 24 DNA Zea mays 6 ctctgcctcc ttgccagttt cagc 24 7 26 DNA Zea mays 7 gggcagagca agagctaggg ataacc 26 8 26 DNA Zea mays 8 catctccacg tcagtctcct ttgtgc 26 9 24 DNA Zea mays 9 ggagactgat gtggagatga agcc 24 10 22 DNA Zea mays 10 gttgcggctt caggtgcact tc 22 11 21 DNA Zea mays 11 gcaacccaac tgaggattcg g 21 12 22 DNA Zea mays 12 cctgctgaag aggcgaagga ag 22 13 21 DNA Zea mays 13 gcggtgttct ctagtggagc g 21 

What is claimed is:
 1. An isolated and purified nucleic acid comprising a polynucleotide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 and conservatively modified and polymorphic variants thereof.
 2. The isolated and purified nucleic acid of claim 1, wherein the polynucleotide is at least 15 nucleotides in length.
 3. An isolated and purified nucleic acid comprising a polynucleotide having at least 60% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
 4. An isolated and purified nucleic acid comprising a polynucleotide having at least 70% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
 5. An isolated and purified nucleic acid comprising a polynucleotide having at least 80% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
 6. An isolated and purified nucleic acid comprising a polynucleotide having at least 90% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
 7. An isolated and purified nucleic acid comprising a polynucleotide having at least 95% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
 8. An isolated and purified polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4 and conservatively modified and polymorphic variants thereof.
 9. An isolated and purified polypeptide comprising an amino acid sequence having at least 60% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
 10. An isolated and purified polypeptide comprising an amino acid sequence having at least 70% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
 11. An isolated and purified polypeptide comprising an amino acid sequence having at least 80% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
 12. An isolated and purified polypeptide comprising an amino acid sequence having at least 90% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
 13. An isolated and purified polypeptide comprising an amino acid sequence having at least 95% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
 14. An expression cassette comprising a promoter sequence operably linked to a nucleic acid of claim 1.
 15. The expression cassette of claim 14 further comprising a polyadenylation signal operably linked to the polynucleotide.
 16. The expression cassette of claim 14 wherein the promoter is a constitutive or tissue specific promoter.
 17. A bacterial cell comprising the expression cassette of claim 14.
 18. The bacterial cell of claim 17 wherein the bacterial cell is an Agrobacterium tumefaciens cell or an Agrobacterium rhizogenes cell.
 19. A plant cell transformed with the expression cassette of claim 14.
 20. A transformed plant containing the plant cell of claim 19.
 21. The transformed plant of claim 20 wherein the plant is Zea mays.
 22. Seed from the transformed plant of claim 20.
 23. Transformed plant seed containing the plant cell of claim 20.